-
Notifications
You must be signed in to change notification settings - Fork 893
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BP-47 (task5): Garbage collection support direct IO entrylogger #3256
Merged
eolivelli
merged 6 commits into
apache:master
from
hangc0276:chenhang/garbage-collection-support-directio-entrylogger
May 5, 2022
Merged
BP-47 (task5): Garbage collection support direct IO entrylogger #3256
eolivelli
merged 6 commits into
apache:master
from
hangc0276:chenhang/garbage-collection-support-directio-entrylogger
May 5, 2022
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
GC support requires that the entrylogger provides a way to retrieve all entrylogs which have been completely flushed to disk. Previously this was done by returning the least unflushed log id. However, this is problematic as it doesn't support the log ids wrapping around. It also means that GC has to start checking for log id existence from zero every time it boots. This change replaces getLeastUnflushedLogId() with getFlushedLogIds(), to give the entrylogger full control of which logs should be considered for GC. It also changes the CompactableLedgerStorage interface, removing getEntryLogger() and adding injection of the entrylogger to the GarbageCollectionThread. This makes testing easier. (cherry picked from commit 853dd84)
rerun failure checks |
1 similar comment
rerun failure checks |
eolivelli
approved these changes
May 4, 2022
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
rerun failure checks |
merlimat
approved these changes
May 4, 2022
rerun failure checks |
Merged
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Motivation
Task 5 of BP-47, Garbage collection support direct IO entrylogger.
GC support requires that the entrylogger provides a way to retrieve all entrylogs which have been completely flushed to disk. Previously this was done by returning the least unflushed log id. However, this is problematic as it doesn't support the log ids wrapping around. It also means that GC has to start checking for log id existence from zero every time it boots.
Changes
This change replaces
getLeastUnflushedLogId()
withgetFlushedLogIds()
, to give the entrylogger full control of which logs should be considered for GC.It also changes the
CompactableLedgerStorage
interface, removinggetEntryLogger()
and adding injection of the entrylogger to the GarbageCollectionThread. This makes testing easier.Master Issue: #2943 , Subtask of #2932
Others
The commit is made by @ivankelly . This is the sub task of #2932 , which pushed out by @mauricebarnum. However, this PR contains too many changes and the number of code lines reaches 8K+, and it is hard to review. According to your suggestion #2932 (comment), and after communicate with @mauricebarnum, we are planing to divide the PR into 6 PRs, Please refer to #2943 (comment).
However, @mauricebarnum doesn't have enough time to deal with those issues, and we desperately need this feature. After communicated with @mauricebarnum and @merlimat , I help to work on divide the PRs and push them out. We are hoping to contain this feature in BookKeeper 4.16.0